Study on Hadoop and MapReduce Framework
Abstract
Hadoop, a Java software framework, supports data-intensive distributed applications. It is developed under an open source license and enables applications to work with thousands of nodes and petabytes of data. Hadoop has become a standard framework for Big Data analysis, and its MapReduce technique makes it especially useful for processing huge amounts of data. Hadoop is also incorporated into cloud computing. In this paper, we introduce Hadoop and the MapReduce framework and describe applications and real-time users of Hadoop.
Keywords: MapReduce, YARN, HDFS, Mapper, Reducer, Job Tracker, Task Tracker, Key, Values.
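To make the Mapper, Reducer, and key/value terminology above concrete, the following is a minimal sketch of the classic word-count job written against the Hadoop MapReduce Java API. The class names (WordCount, TokenizerMapper, IntSumReducer) are illustrative, not taken from the paper.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits the key/value pair (word, 1) for every token in its input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: receives all values for one key (a word) and sums them.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local aggregation before the shuffle
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Packaged into a jar, such a job would be launched with the standard hadoop jar command, passing HDFS input and output paths as arguments.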
Similar resources
Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments
The Hadoop MapReduce framework is an important distributed processing model for large-scale data-intensive applications. The current Hadoop implementation and the Hadoop distributed file system's rack-aware data placement strategy assume a homogeneous cluster, in which each node has the same computing capacity and the same workload is assigned to each node. Default Hadoop d...
Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming
The objective of this study is to verify the importance of cloud computing services in managing and analyzing big data in business organizations. The rapid development in the use of information technology in general, and network technology in particular, has led many organizations to make their applications available via electronic platforms hos...
An Efficient Approach to Optimize the Performance of Massive Small Files in Hadoop MapReduce Framework
The most popular open source distributed computing framework, Hadoop, was designed by Doug Cutting and his team; it involves thousands of nodes to process and analyze huge amounts of data, called Big Data. The major core components of Hadoop are HDFS (Hadoop Distributed File System) and MapReduce. This framework is the most popular and powerful one for storing, managing, and processing Big Data appl...
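The abstract above is truncated, and the paper's own optimization is not reproduced here. As a point of reference, one widely used mitigation for the HDFS small-files problem is to pack many small files into a single SequenceFile (key = file name, value = file bytes), so the NameNode tracks one large file rather than thousands of tiny ones. A minimal sketch under that assumption (class name and paths are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Packs all files in one directory into a single SequenceFile so that
// the HDFS NameNode tracks one large file instead of many small ones.
public class SmallFilePacker {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path inputDir = new Path(args[0]); // directory of small files
    Path packed = new Path(args[1]);   // output SequenceFile

    try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(packed),
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(BytesWritable.class))) {
      for (FileStatus status : fs.listStatus(inputDir)) {
        if (!status.isFile()) {
          continue; // skip subdirectories
        }
        byte[] contents = new byte[(int) status.getLen()];
        try (FSDataInputStream in = fs.open(status.getPath())) {
          in.readFully(0, contents); // read the whole small file
        }
        // key = original file name, value = raw file bytes
        writer.append(new Text(status.getPath().getName()),
                      new BytesWritable(contents));
      }
    }
  }
}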
Benchmarking and Performance Studies of MapReduce / Hadoop Framework on Blue Waters Supercomputer
MapReduce is an emerging and widely used programming model for large-scale data-parallel applications that need to process large amounts of raw data. There are several implementations of the MapReduce framework, among which Apache Hadoop is the most commonly used open source implementation. These frameworks are rarely deployed on supercomputers as massive as Blue Waters. We want to evaluate ho...
A Survey on MapReduce Performance and Hadoop Acceleration
MapReduce is an implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. Hadoop is an open source implementation of the MapReduce programming model used for large-scale parallel applications such as web indexing, data mining, and scientific simulation. The Hadoop-A framework is able to accelerate Hadoop and delivers significant performance gains compared to ...
Publication date: 2015